Using new indices to predict metabolism dysfunction-associated fatty liver disease (MAFLD): analysis of the national health and nutrition examination survey database

Background Metabolism dysfunction-associated fatty liver disease (MAFLD) is the most common chronic liver disease, yet few MAFLD prediction tools are both simple and accurate. We examined the predictive performance of the albumin-to-glutamyl transpeptidase ratio (AGTR), plasma atherogenicity index (AIP), and serum uric acid to high-density lipoprotein cholesterol ratio (UHR) for MAFLD in order to design practical, inexpensive, and reliable models. Methods The National Health and Nutrition Examination Survey (NHANES) 2007–2016 cycle dataset, which contained 12,654 participants after filtering, was randomly split into training and internal validation sets. We examined the relationships of the AGTR and AIP with MAFLD using binary multivariable logistic regression. We then built a MAFLD predictive model on the training dataset and validated its performance with the internal validation dataset and the 2017–2018 NHANES dataset. Results In the total population, the AUCs of the AIP, AGTR, UHR, and the combination of all three for predicting MAFLD were 0.749, 0.773, 0.728 and 0.824, respectively. Subgroup analysis showed that the AGTR (AUC1 = 0.796; AUC2 = 0.690) and the combination of the three measures (AUC1 = 0.863; AUC2 = 0.766) predicted MAFLD better in nondiabetic patients. Joint prediction outperformed the individual measures in the subgroups. Additionally, the model predicted MAFLD better in females. Adding waist circumference and/or BMI to the model improved predictive performance. Conclusion Our study showed that the AGTR, AIP, and UHR had strong predictive value for MAFLD and that their combination can increase predictive performance. They also performed better in females.

invasive procedure with potential negative effects [7]. Thus, it is of high clinical importance to investigate practical, straightforward, and reliable predictors of fatty liver disease.
The serum uric acid to high-density lipoprotein cholesterol ratio (UHR) is a recently proposed inflammatory marker that has been shown to be associated with the development of NAFLD, metabolic syndrome, diabetes mellitus, insulin resistance, and cardiovascular risk [8][9][10]. The plasma atherogenicity index (AIP), defined as the logarithm of the triglyceride to high-density lipoprotein cholesterol ratio (TG/HDL-C), is significantly elevated in patients with fatty liver disease and may be a potential indicator for identifying fatty liver disease [11]. It has previously been shown to be associated with NAFLD [12], MAFLD [13,14], cardiovascular risk [15], and metabolic risk [16]. The albumin (ALB) to alkaline phosphatase (ALP) ratio, a biological measure of liver function, has been shown in prior research to be a reliable independent predictor of NAFLD and MAFLD [17]. Intrahepatic cholestasis may be involved in the development of NAFLD or MAFLD [18], and cholestatic hepatitis can be reflected by direct bilirubin, glutamyl transpeptidase (GGT) and ALP. Albumin transports bilirubin and cholesterol and is an important indicator of liver function. Currently, an important and practical indicator used to assess liver function in liver cancer is the ALBI (albumin-bilirubin) score [19], which can also be used to predict cirrhosis in the decompensated phase [20].
Although NAFLD and MAFLD share commonalities, their diagnostic criteria differ significantly, and thus many predictors of NAFLD still need to be explored in MAFLD. Consequently, we examined the predictive ability of the AGTR, UHR, AIP and their combination for MAFLD using the National Health and Nutrition Examination Survey (NHANES) 2007-2018 dataset, in order to design a practical, inexpensive, and reliable predictive tool for MAFLD.
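As a minimal sketch, the three indices named above can be computed from routine laboratory values as follows. The text defines the AIP only as "the logarithm" of TG/HDL-C, so the base-10 logarithm and the units chosen here (mmol/L for the AIP, mg/dL for the UHR, g/L and U/L for the AGTR) are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def _require_positive(**values: float) -> None:
    """Reject non-positive inputs, for which the ratios are undefined."""
    for name, value in values.items():
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")


def aip(tg_mmol_l: float, hdl_mmol_l: float) -> float:
    """Plasma atherogenicity index: logarithm of TG/HDL-C.

    Assumption: base-10 logarithm with both lipids in mmol/L.
    """
    _require_positive(tg_mmol_l=tg_mmol_l, hdl_mmol_l=hdl_mmol_l)
    return math.log10(tg_mmol_l / hdl_mmol_l)


def uhr(uric_acid_mg_dl: float, hdl_mg_dl: float) -> float:
    """Serum uric acid to HDL cholesterol ratio (units are an assumption)."""
    _require_positive(uric_acid_mg_dl=uric_acid_mg_dl, hdl_mg_dl=hdl_mg_dl)
    return uric_acid_mg_dl / hdl_mg_dl


def agtr(albumin_g_l: float, ggt_u_l: float) -> float:
    """Albumin to glutamyl transpeptidase ratio (units are an assumption)."""
    _require_positive(albumin_g_l=albumin_g_l, ggt_u_l=ggt_u_l)
    return albumin_g_l / ggt_u_l
```

Because all three indices are simple ratios of values already reported on a standard biochemistry panel, they add no testing cost, which is the practical motivation given above.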

Database
Data were obtained from the NHANES database, which uses a complex, stratified, multistage probability cluster sampling design to assess the health and nutritional status of the U.S. population. All participants provided written informed consent.

Definitions and inclusion criteria
The analysis included subjects 18 years of age or older with data on demographics (age, sex, race, poverty-to-income ratio), triglycerides, HDL, blood uric acid, GGT, albumin, the variables relevant to the diagnosis of MAFLD, and a valid FibroScan examination. After excluding participants with missing key biochemistry data (blood uric acid, triglycerides, glycosylated hemoglobin (HbA1c), HDL, GGT, albumin), incomplete transient elastography data, or data insufficient for the diagnosis of MAFLD, a total of 12,654 individuals from the NHANES 2007-2016 cycles were enrolled for statistical analyses and predictive modeling. Predictive model validation was performed using the 2017-2018 cycle dataset. The inclusion criteria and the participant stratification algorithm are shown in Fig. 1.

Definition of the outcome variable MAFLD
The diagnosis of MAFLD is based on histologic (biopsy), imaging, or blood biomarker evidence of hepatic fat accumulation (hepatic steatosis) plus one of the following three criteria: overweight/obesity, the presence of type 2 diabetes mellitus (T2DM), or evidence of metabolic derangement. Hepatic steatosis was identified by a United States Fatty Liver Index (USFLI) ≥ 30 in the NHANES 2007-2016 dataset or by a controlled attenuation parameter (CAP) > 274 [21] in the 2017-2018 dataset. Metabolic derangement was defined as the presence of at least two of the following items [1]: (i) waist circumference (WC) > 102 cm for men or > 88 cm for women; (ii) blood pressure ≥ 130/85 mmHg or related medication; (iii) fasting plasma triglycerides > 1.70 mmol/L or related medication; (iv) plasma high-density lipoprotein (HDL) cholesterol < 1.0 mmol/L for males or < 1.3 mmol/L for females or related medication; (v) prediabetes (fasting blood glucose 5.6-6.9 mmol/L or hemoglobin A1C 39-47 mmol/mol); and (vi) homeostasis model assessment of insulin resistance (HOMA-IR) score ≥ 2.5.
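The diagnostic rule above can be sketched as a small decision function. This is an illustration under stated assumptions, not the authors' code: the `Participant` fields are hypothetical names, the overweight threshold of BMI ≥ 25 kg/m2 is borrowed from the subgroup definition used later in the paper, a single lipid-medication flag stands in for the "related medication" clauses of items (iii) and (iv), and the blood pressure item is read as systolic ≥ 130 or diastolic ≥ 85 mmHg.

```python
from dataclasses import dataclass, replace


@dataclass
class Participant:
    steatosis: bool               # USFLI >= 30 (2007-2016) or CAP > 274 (2017-2018)
    bmi: float                    # kg/m2
    has_t2dm: bool
    is_male: bool
    waist_cm: float
    systolic_mmhg: float
    diastolic_mmhg: float
    on_bp_medication: bool
    tg_mmol_l: float
    hdl_mmol_l: float
    on_lipid_medication: bool     # simplification: one flag for items (iii) and (iv)
    fasting_glucose_mmol_l: float
    hba1c_mmol_mol: float
    homa_ir: float


def metabolic_risk_count(p: Participant) -> int:
    """Count how many of the six metabolic risk items (i)-(vi) are present."""
    waist_limit = 102.0 if p.is_male else 88.0
    hdl_limit = 1.0 if p.is_male else 1.3
    items = [
        p.waist_cm > waist_limit,
        p.systolic_mmhg >= 130 or p.diastolic_mmhg >= 85 or p.on_bp_medication,
        p.tg_mmol_l > 1.70 or p.on_lipid_medication,
        p.hdl_mmol_l < hdl_limit or p.on_lipid_medication,
        5.6 <= p.fasting_glucose_mmol_l <= 6.9 or 39 <= p.hba1c_mmol_mol <= 47,
        p.homa_ir >= 2.5,
    ]
    return sum(items)


def has_mafld(p: Participant, overweight_bmi: float = 25.0) -> bool:
    """Steatosis plus overweight/obesity, T2DM, or >= 2 metabolic risk items."""
    if not p.steatosis:
        return False
    return p.bmi >= overweight_bmi or p.has_t2dm or metabolic_risk_count(p) >= 2


# Hypothetical example records (values invented for illustration only).
lean_healthy = Participant(
    steatosis=True, bmi=22.0, has_t2dm=False, is_male=False, waist_cm=75.0,
    systolic_mmhg=115.0, diastolic_mmhg=70.0, on_bp_medication=False,
    tg_mmol_l=1.0, hdl_mmol_l=1.6, on_lipid_medication=False,
    fasting_glucose_mmol_l=5.0, hba1c_mmol_mol=34.0, homa_ir=1.2,
)
lean_at_risk = replace(lean_healthy, tg_mmol_l=2.1, homa_ir=3.0)
```

In this sketch a lean participant with steatosis is classified as MAFLD only when two or more risk items are present, which is the branch of the definition that distinguishes MAFLD most clearly from NAFLD.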

Other variable definitions
Participants with at least one of the following conditions were defined as having diabetes in the NHANES: 1. a self-reported prior physician diagnosis of diabetes or current treatment for glycemic control (use of insulin or oral hypoglycemic agents); and 2. laboratory results meeting the following criteria: 1) glycated hemoglobin ≥ 6.5% and 2) fasting blood glucose > 7.0 mmol/L. Hypertension was defined as self-reported physician-diagnosed hypertension or being on prescribed antihypertensive medication. Blood pressure was assessed as the average of 3 consecutive standardized blood pressure readings. Alcohol consumption was categorized as no, moderate, or excessive alcohol consumption [23]. Physical activity was categorized as follows: 1. light activity; 2. moderate activity; and 3. high-intensity activity [24]. A "smoker" was defined as an adult who had smoked at least 100 cigarettes in his or her lifetime, and a "never smoker" was defined as any adult who had never smoked or had smoked fewer than 100 cigarettes in his or her lifetime. The homeostasis model assessment of insulin resistance (HOMA-IR) score was calculated as (fasting insulin in μIU/mL) × (fasting glucose in mg/dL)/405 [25]. Total protein intake and vitamin C intake were extracted from the 2-day dietary interview data and averaged over the two days. The NCHS Ethics Review Board approved the research. Furthermore, written informed consent was obtained from each subject [26].
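The HOMA-IR formula above is a one-line calculation. The sketch below assumes fasting insulin in μIU/mL, which is the conventional unit for the constant 405, and fasting glucose in mg/dL; the helper for glucose reported in mmol/L uses the approximate factor of 18 mg/dL per mmol/L, which is why the equivalent constant in that form is 22.5. Function names are hypothetical.

```python
def homa_ir(fasting_insulin_uiu_ml: float, fasting_glucose_mg_dl: float) -> float:
    """HOMA-IR = insulin (uIU/mL) x glucose (mg/dL) / 405."""
    return fasting_insulin_uiu_ml * fasting_glucose_mg_dl / 405.0


def homa_ir_from_mmol(fasting_insulin_uiu_ml: float,
                      fasting_glucose_mmol_l: float) -> float:
    """Same score with glucose in mmol/L: 405 / 18 = 22.5."""
    return fasting_insulin_uiu_ml * fasting_glucose_mmol_l / 22.5


def is_insulin_resistant(score: float, threshold: float = 2.5) -> bool:
    """Risk item (vi) of the MAFLD definition: HOMA-IR >= 2.5."""
    return score >= threshold
```

For example, an insulin of 10 μIU/mL with a glucose of 81 mg/dL (4.5 mmol/L) gives a score of 2.0, which is below the 2.5 threshold used in the MAFLD definition.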

Statistical analysis
The study is reported in accordance with the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement [27]. NHANES uses a complex survey design to ensure national representativeness, and the analysis of the complex sampling design followed NCHS guidance. Normally distributed variables are expressed as means (standard deviations), and non-normally distributed variables are expressed as medians (quartiles). Categorical variables are shown as unweighted counts (weighted %). Categorical variables were compared with weighted chi-square tests, normally distributed continuous variables with t tests, and non-normally distributed continuous variables with Wilcoxon rank-sum tests. Weighted univariate and multivariate logistic regression models were used to identify associations between the study variables and the outcome variable (MAFLD), and the results are reported as odds ratios (ORs) with 95% confidence intervals (CIs) for the unadjusted, partially adjusted, and fully adjusted models. The partially adjusted model included age, gender, race, and income-poverty ratio; the fully adjusted model additionally included BMI, physical activity, diabetes mellitus, alkaline phosphatase, mercury, cadmium, transaminases, smoking, drinking, protein intake, vitamin C and LDL. Two-sided P values less than 0.05 were considered statistically significant.
For model development, the NHANES 2007-2016 cycle dataset (12,654 participants in total) was randomly divided in a 7:3 ratio into a training dataset (8,858 participants) and an internal validation dataset (3,796 participants). The training dataset was used to develop the model, internal validation was performed using the internal validation dataset, and secondary validation was performed using the dataset from the NHANES 2017-2018 cycle.
R (version 4.1.2) was used for all data extraction and statistical analyses. Missing covariate values were handled as follows: covariates with < 20% missing values were filled by multiple imputation (mi package), and covariates with more than 20% missing values were excluded from the data. Observations with missing values for the study variables or for the key variables needed to diagnose MAFLD were excluded. The gtsummary package was used to construct the output of the prediction model, and the qROC package was used to plot the ROC curves and output the AUC values.
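The 7:3 random split described in this section can be illustrated with a short sketch. The authors worked in R and do not report their splitting code or random seed, so the Python function below, its name, and the seed value are assumptions chosen only to show how 12,654 participants divide into 8,858 and 3,796.

```python
import random
from typing import List, Sequence, Tuple


def split_train_validation(
    ids: Sequence[int],
    train_fraction: float = 0.7,
    seed: int = 2024,  # hypothetical seed; the paper does not report one
) -> Tuple[List[int], List[int]]:
    """Shuffle participant ids reproducibly and split them 7:3."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Participant ids here are just 0..12653, standing in for NHANES sequence numbers.
train_ids, validation_ids = split_train_validation(range(12654))
```

Splitting on participant identifiers rather than on rows keeps every participant in exactly one set, so the internal validation set stays independent of the data used to fit the model.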

Baseline Characteristics of the Subjects
The baseline characteristics of the subjects are shown in Tables 1 to 4. As shown in Tables 1 and 2, of the 12,654 subjects, 4020 were diagnosed with MAFLD (32%), with a median age of 53 years, and 8634 were non-MAFLD subjects (68%), with a median age of 43 years. ALT, TC, ALP, CRE, BMI, WC, AIP, and UHR were higher in MAFLD patients than in non-MAFLD subjects (p < 0.05). Tables 3 and 4 show that the baseline characteristics of the training set were similar to those of the internal and secondary validation sets.

Predictive results of AGTR, UHR, and AIP on MAFLD
Using the test set extracted from five cycles of NHANES data from 2007-2016 for testing the predictive model, the
To further validate the predictive ability of the model, the entire 2017-2018 cycle dataset was used for secondary validation, and the results are shown in Table 6 and Fig. 2B. In the secondary validation, the ability of the three indices to jointly predict MAFLD (AUC = 0.775) was similar to that of the AIP alone (AUC = 0.743), and both were stronger than the predictive ability of the AGTR or the UHR alone.
To further distinguish the predictive ability of the above models for MAFLD, we also performed subgroup analyses by age (18 ≤ age < 65 and ≥ 65), BMI (< 25 kg/m2 and ≥ 25 kg/m2), sex (female and male), diabetes status, and race (Mexican American, Other Hispanic, Non-Hispanic White, Non-Hispanic Black and Other Race). The ROC curves for the subgroups are shown in Figs. 3, 4 and 5 (internal validation) and Figs. 6, 7 and 8 (second validation). In the subgroups, the combined prediction outperformed the three individual indices, and the combined model predicted MAFLD better in female, nonoverweight and Mexican American participants.
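The AUC values reported in these results can be read as the probability that a randomly chosen MAFLD participant receives a higher model score than a randomly chosen non-MAFLD participant. The sketch below computes that quantity directly from pairwise comparisons. It is an unweighted illustration with a hypothetical function name; it does not apply the NHANES survey weights and is not the R routine the authors used.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly.

    labels: 1 for MAFLD, 0 for non-MAFLD. Ties count as half a correct pair.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present to compute an AUC")
    correct = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                correct += 1.0
            elif case_score == control_score:
                correct += 0.5
    return correct / (len(cases) * len(controls))
```

For a marker that is lower in disease, such as the AGTR, the values would be negated (or a fitted model probability used instead) before calling the function, so that higher scores always indicate higher predicted risk. The pairwise loop is quadratic in sample size, which is acceptable for a demonstration but slower than rank-based implementations on full survey data.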

Comparison of the combined model and available models for predicting MAFLD
To complement and improve the new model, we compared it with the existing model (A-W-B) [14] and added waist circumference and/or body mass index (BMI) to the new model as a supplement. The DeLong test was used to compare the models, and the results of the analysis are shown in

Discussion
The results of this study showed that a greater AGTR was associated with a lower risk of developing MAFLD, with an OR of 0.31, indicating that each unit increase in the AGTR reduced the odds of MAFLD by 69%, which is indicative of a strong independent protective factor. However, the results are based on cross-sectional data, and multifactorial logistic regression analyses may be biased for non-rare diseases in cross-sectional studies [34]. Albumin is a biologically active substance synthesized by the liver and a marker of liver function with many biological functions. It is the most abundant plasma protein in human blood, transporting metals, fatty acids, cholesterol, bile pigments, and drugs, and it is also the main antioxidant in body fluids, playing an important anti-inflammatory role in inflammatory oxidative stress [35]. In the present study, albumin levels were lower in the MAFLD population than in the non-MAFLD population. This may reflect the involvement of inflammation in the development of MAFLD: lipid accumulation in the liver promotes the progression of hepatic inflammation [4], which lowers the albumin level in the MAFLD population. Albumin binds to free fatty acids and reduces their levels. Free fatty acids are one of the important triggers of insulin resistance: increased levels of free fatty acids can impair insulin sensitivity, and the induction of tissue oxidative stress can lead to tissue insulin resistance [36]. Glutamyl transpeptidase (GGT) has good sensitivity for the diagnosis of NAFLD and is one of the components of the Fatty Liver Index (FLI). Because GGT participates in the metabolism of the glutathione antioxidant system, it can be elevated in inflammatory states. It has been shown that GGT also increases the risk of insulin resistance, which is considered an important factor in the development of MAFLD [37]. Albumin is negatively correlated with GGT. On the one hand, when the albumin level decreases, free fatty acids are elevated, which stimulates the synthesis and release of GGT [38]. On the other hand, the anti-inflammatory effect of albumin inhibits oxidative stress and protects the liver, thus reducing the risk of MAFLD. In addition, GGT is an important indicator of intrahepatic cholestasis, and intrahepatic microcholestasis is involved in the development of MAFLD, so an elevated level of GGT or a decreased level of albumin will increase the risk of developing MAFLD. We therefore hypothesized that the AGTR would have predictive value for MAFLD. This hypothesis was confirmed in both internal and secondary validation, which showed that the AGTR was significantly better than the UHR in predicting MAFLD in the diabetic population, indicating that it may be a potential inflammatory marker after the UHR and a more accessible and accurate predictive indicator for MAFLD patients. Although previous studies have shown that 1/AGTR can be used as an independent predictor of coronary artery disease [39,40], there are still few studies on the AGTR, and its predictive value in MAFLD or NAFLD has not yet been explored. Our study is the first to use the AGTR as a predictor of MAFLD.
UHR is a relatively recent and novel marker of inflammation, consisting of uric acid and HDL. High UHR levels have been shown to be associated with a high abdominal visceral fat area (VFA), which is associated with central obesity, a risk factor for the development of MAFLD [8,41]. In addition, a high UHR may increase the burden of inflammation and oxidative stress, which indirectly affects the insulin sensitivity of patients, leading to the development of MAFLD. In the present study, the predictive value of UHR for MAFLD was explored, and multiple subgroup analyses showed better predictive performance in the female population, which may be related to differences in hormone levels between genders. The results are consistent with previous studies [31].
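The odds ratio interpretation given earlier in this discussion for the AGTR (OR = 0.31, read as a 69% reduction) and the caveat about non-rare outcomes can be made concrete with a short calculation. The sketch below converts an OR to a percent change in odds and, as an illustration of why odds and risk diverge when the outcome is common, applies the widely used approximation RR ≈ OR / ((1 − P0) + P0 × OR). Using the overall sample prevalence of 32% as the reference risk P0 is an assumption made purely for illustration; the paper does not report this conversion.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in the odds of the outcome per unit increase."""
    return (odds_ratio - 1.0) * 100.0


def approx_risk_ratio(odds_ratio: float, reference_risk: float) -> float:
    """Approximate risk ratio from an odds ratio and the reference-group risk.

    Uses RR ~= OR / ((1 - P0) + P0 * OR); P0 must lie strictly between 0 and 1.
    """
    if not 0.0 < reference_risk < 1.0:
        raise ValueError("reference_risk must be between 0 and 1")
    return odds_ratio / ((1.0 - reference_risk) + reference_risk * odds_ratio)


odds_change = percent_change_in_odds(0.31)        # change in odds, in percent
illustrative_rr = approx_risk_ratio(0.31, 0.32)   # P0 = 0.32 is an assumption
```

With a common outcome the approximate risk ratio lies closer to 1 than the odds ratio, so a 69% reduction in odds corresponds to a smaller reduction in risk, which is the direction of bias that the cited caveat [34] warns about.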
AIP is a marker of lipid metabolism that is strongly associated with metabolic syndrome and the occurrence of adverse cardiovascular events; therefore, in this study, we evaluated the relationship between the AIP and MAFLD and demonstrated that the AIP was significantly and positively associated with the risk of developing MAFLD and could be used as a predictor of MAFLD. In the total population, our findings are compatible with a previous meta-analysis [12] showing the beneficial role of the AIP in predicting MAFLD or NAFLD, with internal validation and secondary validation both showing an AUC > 0.7. In subgroup analyses, the AIP predicted MAFLD better in nondiabetic than in diabetic populations. A high AIP not only increases the risk of insulin resistance [42] but also reflects disturbances in lipid metabolism. A retrospective study based on a Chinese diabetic population showed that the AIP has predictive value in the diabetic population [13]. However, that study only evaluated the diabetic population and obtained an AUC of 0.57, which indicates limited predictive performance. Our study is consistent with the findings of Duan, Shao-Jie et al. [14]. However, we added subgroup analyses by diabetes status and race, which allowed the predictive value of the AIP to be validated in a wider population.
Fig. 3 The internal validation ROC curve of the subgroup
Fig. 4 The internal validation ROC curve of the subgroup
Fig. 5 The internal validation ROC curve of the subgroup
To further improve the performance of the prediction model for MAFLD, we combined the AGTR, AIP, and UHR to jointly predict MAFLD. Owing to the complementary predictive performance of the three indices, their combination predicted MAFLD better than any individual index in the total population as well as in the subgroup analyses, and the joint predictive ability was best in the female subgroup. In the female population, the prevalence of NAFLD and MAFLD was lower than in the male population, which may be attributed to the protective effect of estradiol: estradiol has an antioxidant effect, and estradiol levels in nonmenopausal women are higher than those in men [43]. Furthermore, estradiol reduces serum concentrations of GGT, uric acid and triglycerides and indirectly reduces diet-induced fatty liver injury via peroxidase [44][45][46]. These factors may explain why the AGTR, AIP, and UHR are better predictors in the female population.
Fig. 6 The second validation's ROC curve for the subgroup
Finally, we compared our model with the A-W-B model, and our model supplemented with waist circumference and/or BMI was clearly superior to the A-W-B model in the total population as well as in each subgroup. Therefore, when our model is used in the clinical setting, adding waist circumference or BMI can improve the prediction of MAFLD. Our study also showed that waist circumference is a better predictor of MAFLD than BMI, which may be because waist circumference is more reflective of central obesity [47], an important risk factor for MAFLD. Therefore, we recommend prioritizing waist circumference over BMI when screening people for MAFLD [48]. The predictive model based on weighted analysis in this study uses metrics that are easy to obtain, less costly than CT, MRI, and other imaging, and easy to compute, and it is suitable for use in physical exams or hospitalizations in the U.S. population.
Fig. 7 The second validation's ROC curve for the subgroup
Fig. 8 The second validation's ROC curve for the subgroup
Several advantages of this study are worth mentioning. To our knowledge, this is the first study to use the AGTR as a predictor of MAFLD. In addition, this is the first study to assess the predictive efficacy of the AIP for MAFLD in the NHANES dataset. However, we acknowledge three main limitations. First, this is a cross-sectional study, which prevents us from drawing conclusions about causality; a longitudinal design would make the results more reliable. Second, the modalities we used to diagnose fatty liver were the USFLI and transient elastography, and although their accuracy has been widely validated, we may have underestimated the prevalence of MAFLD; the gold standard is still liver puncture biopsy. Finally, some of the data used in the diagnosis of MAFLD were derived from questionnaires, and the results may be somewhat biased. We may have underestimated the impact of factors such as diet, exercise, and alcohol consumption on the predictive markers. More prospective cohort studies are still needed to fully validate our findings.

Conclusion
In conclusion, our study showed that the AGTR, AIP, and UHR have strong predictive value for MAFLD and that their combination can increase predictive performance, especially in the female population. This study is important for developing personalized MAFLD diagnostic and treatment methods.

Fig. 2 A: The ROC curve for internal validation; B: The ROC curve for the second validation

Table 2
Basic characteristics of participants according to MAFLD from NHANES 2017-2018
AGTR, albumin to glutamyl transpeptidase ratio; AIP, plasma atherogenicity index; UHR, serum uric acid to high-density lipoprotein cholesterol ratio
1 n (unweighted) (%); Median (IQR)
2 Chi-squared test with Rao & Scott's second-order correction; Wilcoxon rank-sum test for complex survey samples; t-test adapted to complex survey samples

Table 3
Basic characteristics of participants according to training and internal validation set

The predictive ability of the three individual indices for MAFLD was similar, whereas the combined model predicted MAFLD better than any index alone, with the best cutoff value of the combined predictive model being 0.334 (sensitivity = 0.761, specificity = 0.739).
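A common way to choose a "best" cutoff like the one reported here is to maximize Youden's J statistic (sensitivity + specificity − 1) over candidate thresholds. The text does not state which criterion the authors used, so the sketch below is an assumption-labeled illustration with a hypothetical function name, and it ignores survey weights.

```python
from typing import Sequence, Tuple


def best_cutoff_youden(
    labels: Sequence[int], scores: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A participant is predicted positive when score >= cutoff.
    Ties in J are resolved in favor of the lowest cutoff.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best_j, best = -1.0, (0.0, 0.0, 0.0)
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
        true_neg = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
        sensitivity = true_pos / n_pos
        specificity = true_neg / n_neg
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_j, best = j, (cutoff, sensitivity, specificity)
    return best
```

As a consistency check on the reported figures, a sensitivity of 0.761 and a specificity of 0.739 correspond to J = 0.761 + 0.739 − 1 = 0.500.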

Table 4
Basic characteristics of participants according to training and second validation set

We found that adding waist circumference and/or BMI to the joint prediction model improves predictive performance, with excellent performance in the internal test set (AUC > 0.90) and in the secondary validation set (AUC > 0.80). It is worth noting that the

Table 5
Analysis of the correlation between the study variables and MAFLD from NHANES 2007-2016